A.moharamkhani
Validity and the social dimension of language testing

By Afsane Moharamkhani

Instructor: Dr. Ahmadi

Date: March 24, 2013

'''Abstract'''

''Messick defined validity as "an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment" (1989, p. 13, italics in original). The stance adopted by McNamara and Roever is that while recent work has been beneficial in providing conceptual frameworks for understanding test design and score interpretation, Messick raised questions about the social consequences of testing that others after him often leave unaddressed.''

''Here we will consider the theories of validation that have most influenced current thinking in our field, in the work of Messick (following Cronbach) and his successors Mislevy and Kane, and its interpretation within language testing by Bachman, Chapelle, Lynch, Kunnan, Shohamy, and others.''

'''Introduction'''

In what ways is the social dimension of language assessment reflected in current theories of the validation of language tests? Contemporary validity theory has developed procedures for supporting the rationality of decisions based on tests and has thus addressed issues of test fairness. However, although validity theory has again begun to develop ways of thinking about the social dimensions of test use, many issues remain unresolved. In fact, it almost feels as if the ongoing effort to incorporate the social in this latter sense goes against the grain of much validity theory, which is still heavily marked by its origins in the individualist and cognitively oriented field of psychology. Here we ask how the social dimension affects the nature of testing and its results.

'''Definition of important parts'''

'''Validity'''

A test is valid when it tests what it is supposed to test. Validity is not a mathematical property but a matter of judgment.
'''Construct validity'''

A construct must meet two requirements:
* First, it must be defined in such a way that it becomes measurable.
* Second, it should be defined in such a way that it can be related to other, different constructs.

'''Cronbach'''

Current discussions of validity in educational assessment are heavily influenced by Cronbach, who is often called the father of construct validity. Cronbach and Meehl introduced the term construct validity as an alternative to criterion-related validity, and construct validity came to be given a central role as the notion that subsumes other types of validity. A weak and a strong version of construct validation were distinguished:
* The weak version is a haphazard collection of evidence that supports the particular interpretation to be validated.
* The strong version is based on the idea of falsification advanced in Popperian philosophy.

At first Cronbach did not address the importance of the sociopolitical context and its influence on testing as a whole. Later, however, he recognized that judging the results of testing to be positive or negative depends on social views of what counts as a desirable result, and that these views change over time.

'''Messick'''

Messick incorporated the social dimension of testing explicitly within his model. Like Cronbach, Messick saw assessment as a process of reasoning and information gathering carried out in order to make inferences about individuals, and he saw establishing the meaningfulness of these inferences as the task of test development and research. He made the social dimension more concrete by arguing two things:
* Our understanding of what we measure, and the things we prefer to measure, reflect values that are social and cultural in origin.
* Tests have real effects in the educational and social contexts in which they are used.

Messick put these aspects in the form of a matrix. In the matrix as Messick presents it, there is no explicit relationship between test fairness and the social context of testing.
Consider, for example, tests such as IELTS, IDP, ESOL, and TOEFL iBT, which are used to decide whether an international student should be admitted to an English-medium educational environment. It cannot be avoided that there is no way to be sure the candidate will perform in the same way in non-test settings. Deciding whether the person will be admitted or not therefore depends on two steps:
* Modeling the demands of the target setting.
* Predicting the standing of the individual in relation to this construct.

A test is a procedure for gathering evidence to make decisions about individuals, so the evidence should be collected carefully, and there must be a relationship between the target setting, the test, and the construct, as the accompanying figure illustrates.

'''Mislevy'''

The work of Mislevy and his colleagues brings analytic clarity to the procedures involved in designing tests. According to Mislevy, "An assessment is a machine for reasoning about what students know, can do, or have accomplished, based on a handful of things they say, do, or make in particular settings" (Mislevy, Steinberg, & Almond, 2003, p. 4). However, Mislevy does not consider the context in which tests are used and thus cannot problematize the determination of test constructs as a function of their role in the social and policy environment.

'''Kane’s Approach to Test Score Validation'''

Kane has also developed a systematic approach to thinking through the process of drawing valid inferences from test scores. Kane points out that we interpret scores as having meaning, so the same score might be given different interpretations. Kane also pointed out that generalization across tasks is often poor in complex performance assessments: the fact that a person can handle a complex writing task involving one topic and its supporting stimulus material does not necessarily mean that the person will perform in a comparable way on another topic with another set of materials.
If task generalizability is weak, or if the impact of raters or the rating process is large, then this “bridge” breaks down and we cannot move on in the interpretative argument. Kane thus distinguishes two types of inference (''semantic inferences'' and ''policy inferences'') and two related types of interpretation. Interpretations that only involve semantic inferences are called ''descriptive interpretations''; interpretations involving policy inferences are called ''decision-based interpretations''.

'''The Social Dimension of Validity in Language Testing'''

Hymes (1967, 1972) first advanced the socially oriented view of language that informs communicative testing, but the most elaborated and influential discussion of the validity of communicative language tests, Bachman’s landmark ''Fundamental Considerations in Language Testing'' (1990), builds its discussion in significant part around Messick’s approach to validity. Bachman (1990) introduced his model of communicative language ability, reworking and clarifying the earlier interpretations by Canale and Swain (1980) and Canale (1983) of the relevance of the work of Hymes (1972) on first languages to competence in a second language.

In communicative language testing, the target of test inferences is performance of a set of communicative tasks in various contexts of use. Bachman characterizes this target in two stages. First, he assumes that all contexts have in common that they make specific demands on aspects of test-taker competence. Although the terms in which communicative language ability is discussed include social dimensions such as sociolinguistic appropriateness, drawing on the ethnographic work of Hymes, the model is cognitive and psychological, a reflection of the broader traditions of both linguistics and educational measurement in which Bachman’s work is firmly located. The social context of the target language use situation is defined in relation to the language user, so context and ability are closely related.
The target language use situation is conceptualized in terms of components of communicative language ability, which, in turn, is understood as the ability to handle the target language use situation. The situation or context is projected onto the learner as a demand for a relevant set of cognitive abilities; in turn, these cognitive abilities are read onto the context. What is absent here is a theory of social context in its own right, that is, one that is not primarily concerned with cognitive demands.

Kane’s integration has certain advantages, in that it sees test use and test consequences as aspects of validity, although the distinction between semantic inferences and policy inferences upholds a separation that survives even into Messick’s matrix, where it reflects, as we have seen, the development of validity theory from its relatively asocial psychometric origins. In the language testing literature, there has been considerable research on test consequences under the headings of ''test washback'', usually referring to the effect of tests on teaching in the subject areas that will be examined, and ''test impact'', referring to broader effects in the school and beyond. Integrating studies of washback and impact into a larger interpretative argument has the advantage of framing the significance of such studies more clearly in terms of investigating the policy assumptions involved in testing programs.

'''The Broader Social Context: The Limits of Validity'''

Kunnan, and then Shohamy, addressed social and policy issues. It should be noted, however, that the context in which language tests are used is presented quite unproblematically: the responsibility of the tester is to make the “beneficial” information in the test as “useful” as possible. How can tests persist in being so powerful, so influential, so domineering, and play such enormous roles in our society? One answer to this question is that tests have become symbols of power for both individuals and society.
Bachman is right to suggest that it is not practical to attempt to incorporate the perspective of test takers as “political subjects in a political context” in an interpretative argument. If language testing research is to be more than a technical field, it must, at the very least as a counterpoint to its practical activity, develop an ongoing critique of itself as a site for the articulation and perpetuation of social relations. In this way, language testing research has a chance to move beyond the limits of validity theory and make a proper contribution to the wider discussion of the general and specific functions of tests in contemporary society.

'''Conclusion'''

The social dimension affects test results. It is therefore important to reproduce real-life situations in the test environment and to create a natural setting, so that learners act in real situations the same way they act in the test and can use what they have learned.

'''References'''

Alderson, J. C., Clapham, C., & Wall, D. (1995). ''Language test construction and evaluation''. Great Britain.

McNamara, T., & Roever, C. (2006). ''Language Testing: The Social Dimension''. Retrieved April 1, 2013 from the World Wide Web: http://tesl-ej.org/ej43/r9.html

Bedrick, S. (2006). Validity and the social dimension of language testing. Retrieved March 19, 2013 from the World Wide Web: http://www.citeulike.org/user/stevenbedrick/article/3887090

Young, R. Validity and the social dimension of language testing. ''Language Learning''. Retrieved March 27, 2013 from the World Wide Web: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9922.2006.00379.x/abstrac

McNamara, T. The social character of language tests. In Widdowson, H. G. (Ed.), ''Language testing''. Oxford University Press.